Fast transcription of unstructured audio recordings
Authors
Abstract
We introduce a new method for human-machine collaborative speech transcription that is significantly faster than existing transcription methods. In this approach, automatic audio processing algorithms robustly detect speech in audio recordings and split it into short, easy-to-transcribe segments. Sequences of speech segments are loaded into a transcription interface that lets a human transcriber simply listen and type, obviating the need to manually find and segment speech or explicitly control audio playback. As a result, playback stays synchronized to the transcriber’s speed of transcription. In evaluations using naturalistic audio recordings made in everyday home situations, the new method is up to 6 times faster than other popular transcription tools while preserving transcription quality.
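The abstract does not say which speech detection or segmentation algorithm is used, so the following is only a minimal sketch of the general idea, assuming a simple frame-energy threshold rather than the authors' detector. The function names (`frame_energies`, `detect_speech_segments`) and the parameters `frame_len`, `threshold`, and `max_frames` are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each complete, non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech_segments(
    samples: Sequence[float],
    frame_len: int = 160,      # assumed: 10 ms frames at 16 kHz
    threshold: float = 0.01,   # assumed energy threshold, not from the paper
    max_frames: int = 50,      # assumed cap so segments stay short
) -> List[Tuple[int, int]]:
    """Return (start_sample, end_sample) pairs for runs of high-energy frames.

    A run is closed when the energy drops below the threshold or when it
    reaches max_frames, so long stretches of speech are split into pieces.
    """
    segments: List[Tuple[int, int]] = []
    energies = frame_energies(samples, frame_len)
    start = None  # index of the first frame of the currently open segment
    for idx, energy in enumerate(energies):
        active = energy >= threshold
        if active and start is None:
            start = idx
        if start is not None and (not active or idx - start + 1 >= max_frames):
            # End is exclusive: the current frame is included only if active.
            end = idx + 1 if active else idx
            segments.append((start * frame_len, end * frame_len))
            start = None
    if start is not None:
        # Audio ended while a segment was still open.
        segments.append((start * frame_len, len(energies) * frame_len))
    return segments


# Synthetic signal: silence, a burst, silence, a second burst.
signal = [0.0] * 320 + [0.5] * 480 + [0.0] * 160 + [0.5] * 160
print(detect_speech_segments(signal))  # [(320, 800), (960, 1120)]
```

A transcription interface along the lines described above could then play these segments one at a time, advancing only when the transcriber finishes typing. A real system operating on noisy home recordings would likely need a more robust detector than a fixed energy threshold.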
Related resources
Does the recording medium influence phonetic transcription of cleft palate speech?
BACKGROUND: In recent years, analyses of cleft palate speech based on phonetic transcriptions have become common. However, the results vary considerably among different studies. It cannot be excluded that differences in assessment methodology, including the recording medium, influence the results. AIMS: To compare phonetic transcriptions from audio and audio/video recordings of cleft palate spe...
Ecological Acoustics Perspective for Content-Based Retrieval of Environmental Sounds
In this paper we present a method to search for environmental sounds in large unstructured databases of user-submitted audio, using a general sound events taxonomy from ecological acoustics. We discuss the use of Support Vector Machines to classify sound recordings according to the taxonomy and describe two use cases for the obtained classification models: a content-based web search interface fo...
Teachers’ Strategies Used to Foster Teacher-Student and Student-Student Interactions in EFL Conversation Classrooms: A Conversation Analysis Approach
Although a wide range of strategies is used to foster interactions in EFL conversation classrooms, many novice teachers are not aware of them. In view of this problem, the current study aimed to identify such strategies commonly used by EFL teachers in conversation classrooms. To this end, fifty sessions of college-level conversation classrooms were observed and their teacher...
Automated quantisation and transcription of Ornaments from audio recordings
We propose a new method for rhythm quantisation and measurement of expressive timing. This paper focuses on the automatic quantisation and rhythmic transcription of syncopated rhythms and baroque ornaments, e.g. appoggiaturas, mordents and trills, from time-tagged audio recordings without knowing the score in advance. We demonstrate the transcription of the Aria of J. S. Bach’s Goldberg Variation...
Evaluation of Two Mobile Nutrition Tracking Applications for Chronically Ill Populations with Low Literacy Skills
In this chapter, we discuss two case studies that compared and contrasted the use of barcode scanning, voice recording, and patient self-reporting as a means to monitor the nutritional intake of a chronically ill population. In the first study, we found that participants preferred unstructured voice recordings to barcode scanning. Since unstructured voice recordings require costly tran...